A Case for Partitioned Bloom Filters

Abstract

In a partitioned Bloom filter (PBF) the bit vector is split into disjoint parts, one per hash function. Contrary to hardware designs, where they prevail, software implementations mostly ignore PBFs, considering them worse than standard Bloom filters (SBFs) due to the slightly larger false positive rate (FPR). In this paper, by performing an in-depth analysis, we first show that the FPR advantage of SBFs is smaller than thought; more importantly, by deriving the per-element FPR, we show that SBFs have weak spots in the domain: elements that test as false positives much more frequently than expected. This is relevant in scenarios where an element is tested against many filters. Moreover, SBFs are prone to exhibit extremely weak spots if naive double hashing is used, something occurring in mainstream libraries. PBFs exhibit a uniform distribution of the FPR over the domain, with no weak spots, even using naive double hashing. Finally, we survey scenarios beyond set membership testing, identifying many advantages of having disjoint parts: in designs using SIMD techniques, for filter size reduction, test of set disjointness, and duplicate detection in streams. PBFs are better, and should replace SBFs, in general purpose libraries and as the base for novel designs.
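The difference between the two layouts can be made concrete with a minimal sketch. It is not the paper's implementation: the helper names (`two_hashes`, `sbf_indexes`, `pbf_indexes`, `BloomFilter`), the use of SHA-256 to derive the two base hashes, and the requirement that `m` be a multiple of `k` are all assumptions made for illustration. Both variants use naive double hashing of the form `h1 + i * h2`; only the range each index may fall into differs.

```python
import hashlib


def two_hashes(item: bytes) -> tuple[int, int]:
    """Derive two 64-bit base hashes from one SHA-256 digest (illustrative choice)."""
    digest = hashlib.sha256(item).digest()
    return int.from_bytes(digest[:8], "big"), int.from_bytes(digest[8:16], "big")


def sbf_indexes(item: bytes, m: int, k: int) -> list[int]:
    """Standard layout: every one of the k indexes ranges over all m bits.

    With naive double hashing, several indexes may coincide for some items.
    """
    h1, h2 = two_hashes(item)
    return [(h1 + i * h2) % m for i in range(k)]


def pbf_indexes(item: bytes, m: int, k: int) -> list[int]:
    """Partitioned layout: index i is confined to part i, of size m // k bits."""
    h1, h2 = two_hashes(item)
    part = m // k
    return [i * part + (h1 + i * h2) % part for i in range(k)]


class BloomFilter:
    """Toy filter that supports both layouts (assumes m is a multiple of k)."""

    def __init__(self, m: int, k: int, partitioned: bool = True) -> None:
        if m % k != 0:
            raise ValueError("this sketch assumes m is a multiple of k")
        self.m, self.k = m, k
        self.indexes = pbf_indexes if partitioned else sbf_indexes
        self.bits = bytearray((m + 7) // 8)

    def add(self, item: bytes) -> None:
        # Set the k bits selected for this item.
        for j in self.indexes(item, self.m, self.k):
            self.bits[j >> 3] |= 1 << (j & 7)

    def __contains__(self, item: bytes) -> bool:
        # Report "possibly present" only if all k selected bits are set.
        return all(
            (self.bits[j >> 3] >> (j & 7)) & 1
            for j in self.indexes(item, self.m, self.k)
        )


if __name__ == "__main__":
    pbf = BloomFilter(m=1024, k=4, partitioned=True)
    pbf.add(b"alpha")
    print(b"alpha" in pbf)
    print(pbf_indexes(b"alpha", 1024, 4))
```

In the partitioned variant the k indexes of an item are always distinct, because each lies in its own part. In the standard variant nothing prevents two or more indexes from landing on the same bit, which is one way to see how some elements can end up depending on fewer than k bits.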

Similar resources

Bloom filters

Bloom filters are used for answering queries on set membership. In this data structure, the whole element is not stored at the hashed address. Only a few bits are set in an array. Given a set S of cardinality n, we store it in an array of m bits using k hash functions h1(), . . . , hk(). Initially, all the cells in the array are set to 0. Then, for each element in the set, x ∈ S, for each 1 ≤ i...

Bloom Filters for Filesystem Forensics

Digital forensics investigations become more time consuming as the amount of data to be investigated grows. Secular growth trends between hard drive and memory capacity just exacerbate the problem. Bloom filters are space-efficient, probabilistic data structures that can represent data sets with quantifiable false positive rates that have the potential to alleviate the problem by reducing space...

Distance-Sensitive Bloom Filters

A Bloom filter is a space-efficient data structure that answers set membership queries with some chance of a false positive. We introduce the problem of designing generalizations of Bloom filters designed to answer queries of the form, “Is x close to an element of S?” where closeness is measured under a suitable metric. Such a data structure would have several natural applications in networking...

Sliding Bloom Filters

A Bloom filter is a method for reducing the space (memory) required for representing a set by allowing a small error probability. In this paper we consider a Sliding Bloom Filter: a data structure that, given a stream of elements, supports membership queries of the set of the last n elements (a sliding window), while allowing a small error probability and a slackness parameter. The problem of s...

Cryptographically Secure Bloom-Filters

In this paper, we propose a privacy-preserving variant of Bloom-filters. The Bloom-filter has many applications such as hash-based IP-traceback systems and Web cache sharing. In some of those applications, equipping the Bloom-filter with the privacy-preserving mechanism is crucial for the deployment. In this paper, we propose a cryptographically secure privacy-preserving Bloom-filter protocol. ...

Journal

Journal title: IEEE Transactions on Computers

Year: 2023

ISSN: 1557-9956, 2326-3814, 0018-9340

DOI: https://doi.org/10.1109/tc.2022.3218995